Molecular Screening of Bioactive Compounds of Garlic for Therapeutic Effects against COVID-19

An outbreak of pneumonia occurred in December 2019 in Wuhan, China, and caused a serious public health emergency by spreading around the globe. Globally, natural products are receiving more attention than synthetic ones. With that in view, the current study was conducted to discover potential antiviral compounds from Allium sativum. Twenty-five phytocompounds of this plant were selected from the literature and databases: 3-(Allylsulphinyl)-L-alanine, Allicin, Diallyl sulfide, Diallyl disulfide, Diallyl trisulfide, Glutathione, L-Cysteine, S-allyl-mercapto-glutathione, Quercetin, Myricetin, Thiocysteine, Gamma-glutamyl-L-cysteine, Gamma-glutamylallyl-cysteine, Fructan, Lauric acid, Linoleic acid, Allixin, Ajoene, Diazinon, Kaempferol, Levamisole, Caffeic acid, Ethyl linoleate, Scutellarein, and S-allylcysteine methyl ester. Virtual screening of these ligands was carried out against the drug target 3CL protease using CB-Dock. Pharmacokinetic and pharmacodynamic properties determined whether each compound was classed as a drug or non-drug molecule. The best five compounds screened were Allicin, Diallyl Sulfide, Diallyl Disulfide, Diallyl Trisulfide, Ajoene, and Levamisole, which emerged as hit compounds. Further refinement by screening filters identified Levamisole as a lead compound. All interaction visualization analyses were performed using the PyMol molecular visualization tool and LigPlot+. In conclusion, Levamisole was screened as a likely antiviral compound that might be a drug candidate to treat SARS-CoV-2 in the future. Nevertheless, further research is needed to study its potential medicinal use.


Introduction
An outbreak of pneumonia occurred in China in December 2019 and caused a serious public health emergency by spreading around the globe. On 9 January 2020, it was officially announced that the outbreak in Wuhan was caused by a novel coronavirus, 2019-nCoV. On 11 March 2020, COVID-19 was declared a pandemic by the WHO, as it is easily transmitted from one human to another. Globally, nearly 1.9 million new cases and over 12,000 deaths were reported in the week of 16 to 22 January 2023 [1]. The novel coronavirus was named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [2]. Coronaviruses (CoVs) cause infection in humans as well as in animals, and they cause several other diseases related to respiratory issues. These viruses are grouped into alpha, beta, gamma, and delta variants, with a new Omicron variant appearing recently [3].
The pandemic which occurred in the period 2002-2003, in China and Asia Pacific regions, was caused by SARS-CoV and infected more than 8000 people around the globe, with a 10% mortality rate. Fever, cough, and lowering of oxygen level in blood were the

Primary Sequence Retrieval
The primary sequence of target proteins was taken in FASTA format from the protein sequence database UniProt (http://www.uniprot.org accessed on 12 February 2023 (PRJNA318322)) [17].
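The retrieval step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it assumes the public UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.fasta`, and the helper names `parse_fasta` and `fetch_fasta` are hypothetical. The accession P0DTD1 is the one reported in the results of this paper.

```python
from urllib.request import urlopen

# Assumed UniProt REST URL pattern for a single-entry FASTA download.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; keep only the first
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header line found")
    return header, "".join(chunks)


def fetch_fasta(accession, timeout=30):
    """Download the FASTA text for one UniProt accession (needs network access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (requires network access):
# header, sequence = parse_fasta(fetch_fasta("P0DTD1"))
# print(header, len(sequence))
```

Keeping the parser separate from the download makes it possible to reuse it on a FASTA file that was saved manually from the UniProt website.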

Analysis of Physiochemical Properties
Physicochemical analysis was carried out to help determine the function of the protein. ProtParam was used to predict the physicochemical parameters of the SARS-CoV-2 protein, including molecular weight, number of amino acids, isoelectric point, instability index, and grand average of hydropathicity (GRAVY). The ProtParam tool of ExPASy was also used to determine the number of negatively charged residues (Asp + Glu), the number of positively charged residues (Arg + Lys), the aliphatic index, and the atomic composition [18].
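Several of these parameters follow directly from the amino acid composition. The sketch below is an illustration only and is not the ProtParam implementation: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and Ikai's formula for the aliphatic index (Ala + 2.9 Val + 3.9 (Ile + Leu), in mole percent), and the function name `physicochemical_summary` is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def physicochemical_summary(sequence):
    """Compute a few composition-based parameters of a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")

    n = len(seq)

    def fraction(residue):
        return seq.count(residue) / n

    # GRAVY: mean hydropathy over all residues.
    gravy = sum(KYTE_DOOLITTLE[aa] for aa in seq) / n
    # Aliphatic index from the mole percent of Ala, Val, Ile and Leu.
    aliphatic_index = 100 * (
        fraction("A") + 2.9 * fraction("V")
        + 3.9 * (fraction("I") + fraction("L"))
    )
    return {
        "length": n,
        "negative_residues": seq.count("D") + seq.count("E"),  # Asp + Glu
        "positive_residues": seq.count("R") + seq.count("K"),  # Arg + Lys
        "gravy": gravy,
        "aliphatic_index": aliphatic_index,
    }
```

Molecular weight, isoelectric point, and instability index need additional lookup tables and are left to ProtParam itself.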

Identification of Functional Domains
InterPro (https://www.ebi.ac.uk/interpro/D1/D344/5958491 accessed on 12 February 2023) was used to detect and predict the functional domains of the target protein. Conserved domains are informative for studying sequence/structure relationships. InterPro provides a functional analysis of proteins by classifying them into families and predicting domains and active sites [19].

Active Site Identification
A ligand interacts most strongly with the target protein at its active site, where amino acid residues are directly involved in forming the protein-ligand complex. Protein binding pockets were identified by CASTp [19].

Ligand Preparation
The 3-dimensional (3D) structures of the ligands were obtained from PubChem. PubChem is the world's largest collection of freely available chemical information. Ligands can be searched by name, molecular formula, structure, and other information. If the 3D structure of a ligand is not available in PubChem, it can be drawn in ChemDraw from the canonical SMILES obtained from PubChem [16].

Bioactivity Analysis of Ligands and Toxicity Measurement
Ligands selected from the PubChem database should follow Lipinski's rule of five and have the required chemical and physical properties. This was checked by using pkCSM (https://omictools.com/pkcsmtool/database/id/1618 accessed on 12 February 2023) [18].

Molecular Docking Process
The purpose of molecular docking is to find the best conformational interaction between target proteins and compounds. Molecular docking of protein and ligands was performed through Cavity detection-guided Blind Docking (CB-Dock) [20].

Visualization of Ligand/Protein
The docked complexes of ligands and protein were visualized in PyMol. Docking poses generated by CB-Dock were visualized and saved together in a single .pdb file for further analysis [18].

Analysis of Docked Complex
Analysis of the docked complexes was performed using LigPlot+, which automatically generates schematic diagrams of protein-ligand interactions for a given PDB file. These interactions are mediated by hydrogen bonds and by hydrophobic contacts [18].

Ligand ADMET Properties
The main aim of predicting ADMET properties is to select strong candidates by eliminating weak drug candidates in the early stages of drug development. Prediction of the ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties of the drug molecules was performed by using pkCSM [21].

Active Inhibitor Identification
After a detailed analysis of protein and ligand interactions, docking scores, and toxicity studies, the most active inhibitor was identified. The selected compound was our lead compound [19].

FDA-Approved Drug-Proposed Antiviral Agent Comparison
Finally, the comparison was made between the selected antiviral drug Remdesivir and the proposed antiviral agents by comparing all the parameters described above [22].

Target Proteins Structure and Properties
The primary sequence of the target protein (3CL protease) was taken in FASTA format from the UniProt database under accession number P0DTD1 (7096 residues in length). The 3D structure of 3CLpro of SARS-CoV-2 was obtained from the Protein Data Bank (PDB ID: 6M2Q) in .pdb format [16]. Physicochemical properties of 3CL protease were determined by ProtParam under accession No. [A0A6C0M8P6-SARS2]. The physicochemical parameters investigated for 3CLpro of SARS-CoV-2 with the ProtParam ExPASy tool were molecular weight, number of amino acids, isoelectric point, instability index, grand average of hydropathicity (GRAVY), number of negatively charged residues (Asp + Glu), number of positively charged residues (Arg + Lys), aliphatic index, and amino acid and atomic composition. InterPro (https://www.ebi.ac.uk/interpro/14231 accessed on 12 February 2023) was used to identify the functional domains of 3CLpro [16]. The selected target protein is shown in Figure 1.
SARS-CoV-2 is the virus responsible for COVID-19, and so far there is no proper treatment for this pandemic, which has affected the whole world. Understanding the virus requires information about its structure (Figure 1). 3CL protease is highly conserved among coronaviruses, and it is now a target of broad-spectrum antiviral drugs because it is essential for replication of the virus [23]. Early studies on SARS-CoV-2 showed that its Mpro is closely related in structure to other coronaviral main proteases: it shares 99% of its amino acid sequence with bat CoV RaTG13 Mpro and 97% with SARS-CoV Mpro [24].
The functional analysis of the protein sequence was performed with InterPro to determine the conserved and functional domains and sequence/structure relationships. A protein can have more than one functional domain, each performing a different function [19]. The conserved domains and families of the target protein are shown in Figure 2. Figure 3 shows the structure of the protein with its functional domains and pockets highlighted in red. Moreover, Table 1 shows the area and volume of these pockets, which were obtained by using CASTp software.
The 3D structures of the target protein and the ligands were taken as the input for docking, which was performed with CB-Dock [21]. CB-Dock returned possible poses with receptor models, and among these poses the best one was selected by considering properties such as Vina score and cavity size [19]. CB-Dock also predicted the binding sites of the protein, calculated their centers and sizes with a curvature-based cavity detection method, and performed docking with the popular docking program AutoDock Vina [19]. The obtained data are given in Table 2, which shows the minimum and maximum energy, cavity size, binding score, and grid map of the ligands. LigPlot+ (version v.1.4.5) and PyMol Edu (v1.7.4.5) were used for analyzing the docking results. LigPlot+ also determined the interactions between the ligands and the target protein [26]. The graphical system of LigPlot+ automatically generates multiple 2D diagrams of interactions from 3D coordinates. The 2D diagrams of the ligands with the best binding scores and their respective proteins were obtained from LigPlot+ and are shown in Figure 4A-P. As evident from the 2D diagrams, the ligands show only hydrophobic interactions with the protein.
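The pose selection step can be illustrated with a short sketch. It assumes the usual AutoDock Vina convention that scores are estimated binding energies in kcal/mol, so the most negative score indicates the strongest predicted binding. The `Pose` class, the tie-break on cavity size, and the numbers in the usage example are hypothetical and are not taken from Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    cavity_id: int
    vina_score: float  # kcal/mol; more negative = stronger predicted binding
    cavity_size: int   # cavity size as reported by the docking server


def best_pose(poses):
    """Pick the pose with the lowest Vina score.

    Ties are broken in favour of the larger cavity (an illustrative choice).
    """
    if not poses:
        raise ValueError("no poses supplied")
    return min(poses, key=lambda p: (p.vina_score, -p.cavity_size))


# Hypothetical poses for one ligand, for illustration only.
example_poses = [
    Pose(cavity_id=1, vina_score=-5.1, cavity_size=300),
    Pose(cavity_id=2, vina_score=-5.7, cavity_size=150),
    Pose(cavity_id=3, vina_score=-5.7, cavity_size=420),
]
print(best_pose(example_poses).cavity_id)
```

Because the key is a tuple, the score always dominates and cavity size only matters when two poses have identical scores.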

ADMET Properties of Ligands
Lipinski's rule of five served as a first filter in assessing the drug-likeness of the selected ligands. In this study, 25 different ligands were taken, and only a few remained after filtering with different software. When Lipinski's rule of five was applied, Myricetin, Fructan, Linoleic Acid, Ethyl Linoleate, Glutathione, and S-Allyl-Mercapto-Glutathione were eliminated, as shown in Table 3. The ligands were then screened further by calculating the ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties as a measure of pharmacokinetics using the online tool pkCSM [29]. ADMET properties are shown in Tables 4-8, respectively.

Lead Compounds Identification
Myricetin, Fructan, Linoleic Acid, Ethyl Linoleate, Glutathione, and S-Allyl-Mercapto-Glutathione were eliminated by Lipinski's rule of five. Linoleic Acid and Ethyl Linoleate have a logPS value > −2. The remaining ligands can therefore be considered as candidate lead compounds. The best five compounds were Allicin, Diallyl Sulfide, Diallyl Disulfide, Diallyl Trisulfide, Ajoene, and Levamisole. The lead compound of this research work was Levamisole, as is also indicated by molecular docking [20].

Drug Identification against COVID-19
With the emergence of the disease, many FDA-approved drugs were considered for repurposing in order to find the best treatment against the virus. One of the drugs that has been in use in different countries such as the UK, Brazil, India, Pakistan, and many more is Remdesivir. Though the use of this medicine increased during the pandemic, this drug is still in clinical trials [22]. Remdesivir, an antiviral nucleotide analogue prodrug, is the first FDA-approved drug to treat SARS-CoV-2 [22]. Because of its broad-spectrum nature and mechanism of action against various viral families, it is recommended to patients. This medicine is a non-obligate chain terminator of RdRp from SARS-CoV-2 and the related SARS-CoV and MERS-CoV, and it has been investigated and suggested in many different clinical trials against COVID-19 [30].

Reference Drug ADMET Properties
The ADMET properties of the reference drug were studied using the same software as above, pkCSM. Table 9 shows the absorption properties of Remdesivir. The values show that Remdesivir has very low Caco-2 permeability and water solubility. Its intestinal absorption is high but still in the safe range. Remdesivir also has a low skin permeability value. Remdesivir is a P-glycoprotein substrate and an inhibitor of P-glycoprotein I but not of P-glycoprotein II.
Table 10 shows the distribution properties of Remdesivir. The VDss value is low, which means the drug would not be distributed properly. Remdesivir can penetrate the CNS and can also pass the blood-brain barrier.
Table 11 shows the metabolic properties of Remdesivir. It indicates that Remdesivir is not a CYP2D6 substrate; rather, it is a CYP3A4 substrate. Table 11 also shows that Remdesivir is not an inhibitor of CYP1A2, CYP2C19, CYP2C9, CYP2D6, or CYP3A4.
Table 12 shows the excretion properties of Remdesivir. It shows that Remdesivir is not a renal OCT2 substrate, which means this transporter will not help in clearing the drug. A total clearance value of 0.198 is also given with respect to the liver.
Table 13 shows the toxicity properties of Remdesivir. The toxicity parameters show that this drug can be toxic towards the liver, but the other parameters are within acceptable ranges. Remdesivir does not cause skin sensitization, and it is not an inhibitor of hERG I but is an hERG II inhibitor. The tolerated dose value of 0.291 is also acceptable. In addition, a negative AMES toxicity result indicates that it is not mutagenic.
Table 14 shows the docking result of Remdesivir. The table indicates that Remdesivir has a binding score of −8.1, which is quite a good binding score. Additionally, Remdesivir has four hydrogen bond donors and thirteen hydrogen bond acceptors, and it breaks two of Lipinski's rules: its molecular weight is above 500 g/mol, and it has more than 10 hydrogen bond acceptors.

Remdesivir Comparison with Lead Compound
The standard drug Remdesivir was compared with the lead compound Levamisole in terms of physicochemical and pharmacokinetic properties. Table 15 shows that Remdesivir breaks two of Lipinski's rules, those relating to molecular weight and H-bond acceptors: the molecular weight of Remdesivir is 602.585, which is greater than the 500 allowed according to Lipinski, and Remdesivir has 13 H-bond acceptors, whereas according to Lipinski there should not be more than 10. In contrast, Levamisole satisfies all of Lipinski's rules for LogP, molecular weight, H-bond acceptors, and H-bond donors.
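The rule-of-five check described here can be written as a small function. This is a minimal sketch using the standard Lipinski limits (molecular weight ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10); the function name `lipinski_violations` is hypothetical. Only descriptor values stated in the text are used for Remdesivir, and descriptors that are not supplied (here logP) are simply skipped rather than guessed.

```python
def lipinski_violations(mol_weight=None, log_p=None, h_donors=None, h_acceptors=None):
    """Return the list of Lipinski rules violated by the supplied descriptors.

    Descriptors left as None are not checked.
    """
    checks = [
        ("molecular weight > 500", mol_weight, 500),
        ("logP > 5", log_p, 5),
        ("H-bond donors > 5", h_donors, 5),
        ("H-bond acceptors > 10", h_acceptors, 10),
    ]
    return [
        label
        for label, value, limit in checks
        if value is not None and value > limit
    ]


# Remdesivir descriptors as stated in the text (logP not supplied).
remdesivir = lipinski_violations(mol_weight=602.585, h_donors=4, h_acceptors=13)
print(remdesivir)
# prints ['molecular weight > 500', 'H-bond acceptors > 10']
```

Returning the violated rules, rather than a single pass/fail flag, makes it easy to report which limits a ligand exceeds.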

ADMET Properties Comparison
The ADMET properties comparison was performed to check the absorption, distribution, metabolism, excretion, and toxicity properties of the drug and the lead compound, in order to find a better drug candidate.

Absorption Properties Comparison
The parameter of absorption is based on 6 models. The water solubility model gives the solubility of the compound in water at 25 °C. The Caco-2 permeability model is used to predict the absorption of the drug; values greater than 0.90 are considered to indicate high permeability, and on this measure Levamisole is absorbed more than Remdesivir. In the intestinal absorption model, a value of less than 30% means that the drug is not well absorbed. The values for the standard and lead compound show that Levamisole has high intestinal absorption.
In the skin permeability model, which is relevant for transdermal drugs, a log Kp value greater than −2.5 is considered to indicate low skin permeability; according to this, both compounds pass the skin permeability test. The P-glycoprotein substrate model is very important, as P-glycoprotein is an ABC transporter. Both Levamisole and Remdesivir act as substrates. The last model, of P-glycoprotein inhibitors, shows whether the compound is an inhibitor or not [31]. Table 16 shows that Levamisole is an inhibitor of P-glycoprotein II, whereas Remdesivir is an inhibitor of P-glycoprotein I.

Metabolic Properties Comparison
Cytochrome P450 is mainly found in the liver and is responsible for oxidizing xenobiotics so that they can be excreted easily from the body, which makes cytochrome P450 a detoxification enzyme. Some drugs are activated by it, and some are deactivated [32]. Table 17 shows that Remdesivir is a CYP3A4 substrate, and Levamisole is a CYP3A4 substrate and a CYP2D6 inhibitor.
Table 18 shows the comparative distribution properties of Remdesivir and Levamisole. The distribution parameter is based on 4 models. The volume of distribution (VDss) is the theoretical volume in which the total dose would need to be uniformly distributed to give the same concentration as in blood plasma; if this value is above 2.81 L/kg, the drug is distributed more in the tissues than in the blood plasma. Both Remdesivir and Levamisole have a reasonable VDss value. The 2nd model is based on the unbound fraction of the drug in the plasma, as binding to plasma proteins affects the efficiency of the drug. The given value is the fraction of the drug which remains unbound [33]. For BBB permeability, if the value is greater than 0.3 logBB, the drug can easily cross the blood-brain barrier, and if the value is less than −1 logBB, the drug does not properly reach the brain [34]. By these values, it is clear that Remdesivir has a low value; hence, it would be poorly distributed to the brain. Similarly, in the CNS model, a drug with logPS > −2 can easily penetrate the CNS, while drugs with logPS < −3 are unable to reach the CNS. Remdesivir has a low value, and hence, it will not cross into and reach the CNS.
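The distribution cut-offs quoted above translate into simple threshold checks. The sketch below encodes only the thresholds stated in the text (VDss above 2.81 L/kg, logBB above 0.3 or below −1, logPS above −2 or below −3); the function names and the "intermediate" label for values between the two cut-offs are illustrative assumptions.

```python
def tissue_distributed(vdss_l_per_kg):
    """True if VDss (L/kg) indicates distribution mainly into tissues."""
    return vdss_l_per_kg > 2.81


def classify_bbb(log_bb):
    """Classify blood-brain barrier permeability from logBB."""
    if log_bb > 0.3:
        return "readily crosses the BBB"
    if log_bb < -1.0:
        return "poorly distributed to the brain"
    return "intermediate"


def classify_cns(log_ps):
    """Classify CNS penetration from logPS."""
    if log_ps > -2.0:
        return "penetrates the CNS"
    if log_ps < -3.0:
        return "unable to penetrate the CNS"
    return "intermediate"
```

For example, `classify_cns(-1.5)` returns "penetrates the CNS", while any logPS below −3 is classed as unable to penetrate.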

Excretion Properties Comparison
In the first model, Levamisole has a higher total clearance than Remdesivir. The 2nd model concerns renal OCT2 (organic cation transporter 2), a transporter that helps in renal clearance. An OCT2 substrate can show adverse interactions when given together with OCT2 inhibitors [35]. Neither Remdesivir nor Levamisole is a renal OCT2 substrate. Table 19 shows the values of the excretory properties of Remdesivir and Levamisole.

Toxicity Comparison
The toxicity assessment of both the standard drug and the lead compound was based on nine models. Model 1, AMES toxicity, shows that neither the standard nor the lead compound is mutagenic. In Model 2, the maximum tolerated dose, a value equal to or less than 0.477 log mg/kg/day is considered low, and greater values are considered high. Table 15 below shows that Levamisole has a low value of the tolerated dose. The 3rd model covers hERG I and II inhibition, where only Levamisole is an inhibitor of both, while Remdesivir inhibits only hERG II. The 4th model, oral rat acute toxicity, is used to assess relative toxicity. Model 5, oral rat chronic toxicity, gives the lowest dose that could result in an adverse effect [34].
Model 6, hepatotoxicity, shows whether the drug can cause damage to the liver. Table 20 shows that Remdesivir is hepatotoxic. Model 7, relevant for dermal products, checks for skin sensitization. Neither the standard nor the lead compound is a skin sensitizer. Model 8 uses T. pyriformis, and Model 9 uses minnows to check the toxicity [36]. For T. pyriformis, a value > −0.5 is considered toxic, according to which Remdesivir is somewhat toxic; minnow toxicity values below 0.5 mM are considered toxic, and both compounds pass this toxicity test. Table 20 shows the comparative values of toxicity of Remdesivir and Levamisole.
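The numeric toxicity cut-offs mentioned above can be captured in the same way. The sketch below encodes only the thresholds stated in the text; the function names are hypothetical, and the only compound value used is the tolerated dose of 0.291 reported for Remdesivir earlier in the paper.

```python
def classify_max_tolerated_dose(log_mg_per_kg_day):
    """Classify the maximum tolerated dose (log mg/kg/day) as low or high."""
    return "low" if log_mg_per_kg_day <= 0.477 else "high"


def is_toxic_t_pyriformis(value):
    """True if the T. pyriformis toxicity value exceeds the -0.5 cut-off."""
    return value > -0.5


def is_toxic_minnow(lc50_mm):
    """True if the minnow LC50 (in mM) is below the 0.5 mM cut-off."""
    return lc50_mm < 0.5


# Tolerated dose reported for Remdesivir in the text.
print(classify_max_tolerated_dose(0.291))
# prints low
```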

Physiochemical Properties Comparison
To determine the fundamental properties of the compounds, their physicochemical properties were studied. This screening shows that Remdesivir has 27 carbon atoms, 35 hydrogen atoms, 6 nitrogen atoms, 8 oxygen atoms, and one phosphorus atom, whereas Levamisole has 11 carbon atoms, 12 hydrogen atoms, 2 nitrogen atoms, and one sulfur atom. Remdesivir has 4 hydrogen bond donors, whereas Levamisole has none.
Remdesivir has 13 hydrogen bond acceptors, which exceeds the limit of Lipinski's rule. Although the Log P value of Remdesivir is higher than that of Levamisole, the molecular weight of Remdesivir is far greater than that of Levamisole and also exceeds the Lipinski limit. Table 21 shows the comparison of the physicochemical properties of Remdesivir and Levamisole.

Docking Score Comparison
Both the standard and the lead compound were docked, and the docking result gives the best binding score for each. Table 22 compares the Vina scores of the lead compound Levamisole and the standard drug Remdesivir. The binding score of Remdesivir is −8.1 and that of Levamisole is −5.7, which is numerically higher (less negative). AutoDock Vina scores are estimated binding energies in kcal/mol, so a more negative score indicates stronger predicted binding; taken on its own, the docking score therefore favors Remdesivir. The case for Levamisole as a 3CLpro inhibitor rests on its docking score in combination with its drug-likeness and ADMET profile.

Conclusions
The aim of the present research was to discover potential antiviral components from Allium sativum. Twenty-five phytocompounds (which represent almost all classes of natural antiviral compounds) were selected from the literature and databases. Molecular docking against the 3CL protease of SARS-CoV-2 was performed with the online tool CB-Dock, and the five best-scoring phytocompounds were identified as hit compounds. Physicochemical and pharmacokinetic properties determined whether each compound was classed as a drug or non-drug compound. Levamisole was predicted as a lead compound by the virtual screening results. As per the results of this research, the lead compound, Levamisole, can be explored as an important candidate to treat viral infections, especially COVID-19. These potential antiviral compounds of Allium sativum can also be tested for use in the pharmaceutical and medical industries.
Data Availability Statement: All data generated or analyzed during this study are included in this article.